Insights

row

Key Insights

  1. All crops have experienced spikes in price and in the magnitude of percentage changes. Commonly shared factors are climate and competing crops. Corn vs. wheat and corn vs. soybeans show moderate positive correlation (corn vs. wheat is 0.63; corn vs. soybeans is 0.58).

  2. The value at risk of the crop mix tracks market performance, and hence the value of the commodities. The timing of farming correlates directly with the projected volatility of the crop mix: certain mixes of crops provide less volatility and more value for the risk taken. Corn, wheat, and soybeans all have very low mean returns and small standard deviations, which makes them more consistent and less risky when determining the optimal size and mix.

  3. The working capital account of $100 million should be allocated as follows: buy $31 million in corn, $17 million in wheat, and $52 million in soybeans.

Row

References

R Packages used

ggplot2, scales, quadprog, quantreg, shiny, flexdashboard, qrmdata, xts, matrixStats, zoo, QRM, plotly, psych, moments, and magick

Troubleshooting: GitHub and Stack Overflow

---
title: "Final Project"
author: Teng Siong "T.S" Yeap
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    source_code: embed
    theme: "bootstrap"

---

```{r, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
rm(list = ls())
library(ggplot2)
library(flexdashboard)
library(shiny)
library(QRM)
library(qrmdata)
library(xts)
library(zoo)
library(plotly)
#library(ggfortify)
library(psych)
library(matrixStats)
library(moments)
library(quantreg)
library(quadprog)
library(scales)

# PAGE: Exploratory Analysis
data <- na.omit(read.csv("cropsdata.csv", header = TRUE))
prices <- data
# Compute log differences percent using as.matrix to force numeric type
data.r <- diff(log(as.matrix(data[, -1]))) * 100
# Create size and direction
size <- na.omit(abs(data.r)) # size is indicator of volatility
#head(size)
colnames(size) <- paste(colnames(size),".size", sep = "") # Teetor
direction <- ifelse(data.r > 0, 1, ifelse(data.r < 0, -1, 0)) # another indicator of volatility
colnames(direction) <- paste(colnames(direction),".dir", sep = "")
# Convert into a time series object: 
# 1. Split into date and rates
dates <- as.Date(data$DATE[-1], "%m/%d/%Y")
dates.chr <- as.character(data$DATE[-1])
#str(dates.chr)
values <- cbind(data.r, size, direction)
# for dplyr pivoting and ggplot2 need a data frame also known as "tidy data"
data.df <- data.frame(dates = dates, returns = data.r, size = size, direction = direction)
data.df.nd <- data.frame(dates = dates.chr, returns = data.r, size = size, direction = direction, stringsAsFactors = FALSE) 
#non-coerced dates for subsetting on non-date columns
# 2. Make an xts object with row names equal to the dates
data.xts <- na.omit(as.xts(values, dates)) #order.by=as.Date(dates, "%d/%m/%Y")))
#str(data.xts)
data.zr <- as.zooreg(data.xts)
returns <- data.xts # watch for this data below!

# PAGE: Market risk 
corr_rolling <- function(x) {	
  dim <- ncol(x)	
  corr_r <- cor(x)[lower.tri(diag(dim), diag = FALSE)]	
  return(corr_r)	
}
vol_rolling <- function(x){
  library(matrixStats)
  vol_r <- colSds(x)
  return(vol_r)
}
ALL.r <- data.xts[, 1:3]
window <- 90 #reactive({input$window})
corr_r <- rollapply(ALL.r, width = window, corr_rolling, align = "right", by.column = FALSE)
colnames(corr_r) <- c("corn.wheat", "corn.soybeans", "wheat.soybeans")
vol_r <- rollapply(ALL.r, width = window, vol_rolling, align = "right", by.column = FALSE)
colnames(vol_r) <- c("corn.vols", "wheat.vols", "soybeans.vols")
year <- format(index(corr_r), "%Y")
r_corr_vol <- merge(ALL.r, corr_r, vol_r, year)
# Market dependencies
#library(matrixStats)
R.corr <- apply.monthly(as.xts(ALL.r), FUN = cor)
R.vols <- apply.monthly(ALL.r, FUN = colSds) # from MatrixStats	
# Form correlation matrix for one month 	
R.corr.1 <- matrix(R.corr[20,], nrow = 3, ncol = 3, byrow = FALSE)	
rownames(R.corr.1) <- colnames(ALL.r[,1:3])	
colnames(R.corr.1) <- rownames(R.corr.1)	
R.corr.1
R.corr <- R.corr[, c(2, 3, 6)]
colnames(R.corr) <- c("corn.wheat", "corn.soybeans", "wheat.soybeans")
colnames(R.vols) <- c("corn.vols", "wheat.vols", "soybeans.vols")	
R.corr.vols <- na.omit(merge(R.corr, R.vols))
year <- format(index(R.corr.vols), "%Y")
R.corr.vols.y <- data.frame(corn.correlation = R.corr.vols[,1], wheat.volatility = R.corr.vols[,5], year = year)
corn.vols <- as.numeric(R.corr.vols[,"corn.vols"])	
wheat.vols <- as.numeric(R.corr.vols[,"wheat.vols"])	
soybeans.vols <- as.numeric(R.corr.vols[,"soybeans.vols"])

library(quantreg)
taus <- seq(.05,.95,.05)	# Roger Koenker UI Bob Hogg and Allen Craig
fit.rq.corn.wheat <- rq(log(corn.wheat) ~ log(wheat.vols), tau = taus, data = r_corr_vol)	
fit.lm.corn.wheat <- lm(log(corn.wheat) ~ log(wheat.vols), data = r_corr_vol)	
#' Some test statements	
corn.wheat.summary <- summary(fit.rq.corn.wheat, se = "boot")
```
Decision
=======================================================================

Background
-----------------------------------------------------------------------
### Overview
Corn, wheat, and soybeans are the three major crops produced in the United States. In particular, the Midwest has a stable climate for growing them. Given limited farmland and strong seasonality, farmers have to decide how to utilize that land to get the most out of it. Below is a summary of the planting and harvest windows for each crop, according to www.thebalance.com:

- Corn -- **Plant**: Apr - June -- **Harvest**: Oct - Nov

- Wheat -- **Plant**: Apr - May -- **Harvest**: Aug - Sept

- Soybeans -- **Plant**: Apr - June -- **Harvest**: Sept - Nov

####

### Business questions

1. What is the nature of the crops market in general, and in particular for this data set?

2. How would the performance of these commodities affect the timing of farming?

3. How would we manage the allocation of existing resources, given we have just entered this new market?

Row
-----------------------------------------------------------------------
### Stylized Facts of the Crops Market
- Volatility is rarely constant and often has a structure (mean reversion) and is dependent on the past.

- Extreme events are likely to happen with other extreme events.

- Negative returns are more likely than positive returns (left skew).

- Corn and wheat have a moderate correlation of 0.63.

- Corn and soybeans are slightly less correlated than corn and wheat, with a correlation of 0.58.

- Wheat and soybeans have a fair correlation of 0.36.

- There is a positive spillover effect from wheat into corn. In other words, the volatility of the corn market and wheat market are closely related. 

- We can sensitize our analysis with the range of upper and lower bounds on the parameter estimates of the relationship between correlation and volatility.

- The log()-log() transformation allows us to interpret the regression coefficients as elasticities, which vary with the quantile. The larger the elasticity, especially if the absolute value is greater than one, the more risk dependence one market has on the other.
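The elasticity reading in the last bullet comes from the log()-log() quantile regression fit in the setup chunk (`rq(log(corn.wheat) ~ log(wheat.vols), tau = taus)`); schematically, for each quantile $\tau$,

$$
\log(\rho_{corn,wheat}) = \alpha_{\tau} + \beta_{\tau} \log(\sigma_{wheat}) + \varepsilon_{\tau},
$$

so that $\beta_{\tau} = \partial \log(\rho_{corn,wheat}) / \partial \log(\sigma_{wheat})$ is the percent change in the corn-wheat correlation per one percent change in wheat volatility at that quantile. An $|\beta_{\tau}| > 1$ signals strong risk dependence of the corn market on the wheat market there.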

Data
=======================================================================

Column
-----------------------------------------------------------------------

### Data Definitions

- **Corn**: daily corn price (\$ per bushel)
- **Wheat**: daily wheat price (\$ per bushel)
- **Soybeans**: daily soybean price (\$ per bushel)

### Crops Price Percent Changes
```{r}
p <- autoplot.zoo(data.xts[,1:3]) # + ggtitle(title.chg1) #+ ylim(-5, 5)
ggplotly(p)
```


Row 
-----------------------------------------------------------------------

### Initial Analysis of Corn, Wheat and Soybeans

Historical data 2013-2018

- All crops have experienced a number of spikes in price and in the magnitude of percentage changes.  
- Factors that affect the corn price: the ethanol effect, crude oil prices, investor/speculator activity, climate, Chinese demand, geopolitical issues
- Factors that affect the wheat price: the economy, inflation, the production outlook in top wheat-producing countries, climate, the corn/wheat spread
- Factors that affect the soybean price: the economy, climate, US production, the corn effect, Brazilian production, Chinese demand, the global production outlook, the GMO soybean effect

Adapted from http://www.futuresknowledge.com

### Size of Crops Price Percent Changes

```{r}
p <- autoplot.zoo(abs(data.xts[,4:6])) # + ggtitle(title.chg2) #+ ylim(-5, 5)
ggplotly(p)
```

Exploratory Analysis
=======================================================================

Row {.tabset .tabset-fade}
-----------------------------------------------------------------------

### Corn Loss Distribution

```{r}
returns1 <- returns[,1]
colnames(returns1) <- "Returns" #kluge to coerce column name for df
returns1.df <- data.frame(Returns = returns1[,1], Distribution = rep("Historical", each = length(returns1)))
  
alpha <- 0.95 
  
# Value at Risk
VaR.hist <- quantile(returns1,alpha)
VaR.text <- paste("Value at Risk =", round(VaR.hist, 2))
  
# Determine the max y value of the density plot.
# This will be used to place the text above the plot
VaR.y <- max(density(returns1.df$Returns)$y)
  
# Expected Shortfall
ES.hist <- median(returns1[returns1 > VaR.hist])
ES.text <- paste("Expected Shortfall =", round(ES.hist, 2))
  
p <- ggplot(returns1.df, aes(x = Returns, fill = Distribution)) + geom_density(alpha = 0.5) + 
    geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "firebrick1") + 
    geom_vline(aes(xintercept = ES.hist), size = 1, color = "firebrick1") +
    annotate("text", x = 2+ VaR.hist, y = VaR.y*1.05, label = VaR.text) +
    annotate("text", x = 1.5+ ES.hist, y = VaR.y*1.1, label = ES.text) + scale_fill_manual( values = "dodgerblue4")
p
```

### Wheat Loss Distribution

```{r}
returns1 <- returns[,2]
colnames(returns1) <- "Returns" #kluge to coerce column name for df
returns1.df <- data.frame(Returns = returns1[,1], Distribution = rep("Historical", each = length(returns1)))
  
alpha <- 0.95
  
# Value at Risk
VaR.hist <- quantile(returns1,alpha)
VaR.text <- paste("Value at Risk =", round(VaR.hist, 2))
  
# Determine the max y value of the density plot.
# This will be used to place the text above the plot
VaR.y <- max(density(returns1.df$Returns)$y)
  
# Expected Shortfall
ES.hist <- median(returns1[returns1 > VaR.hist])
ES.text <- paste("Expected Shortfall =", round(ES.hist, 2))
  
p <- ggplot(returns1.df, aes(x = Returns, fill = Distribution)) + geom_density(alpha = 0.5) + 
    geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "firebrick1") + 
    geom_vline(aes(xintercept = ES.hist), size = 1, color = "firebrick1") +
    annotate("text", x = 2+ VaR.hist, y = VaR.y*1.05, label = VaR.text) +
    annotate("text", x = 1.5+ ES.hist, y = VaR.y*1.1, label = ES.text, size = 3.5) + scale_fill_manual( values = "dodgerblue4")
p
```

### Soybeans Loss Distribution

```{r}
returns1 <- returns[,3]
colnames(returns1) <- "Returns" #kluge to coerce column name for df
returns1.df <- data.frame(Returns = returns1[,1], Distribution = rep("Historical", each = length(returns1)))
  
alpha <- 0.95 
  
# Value at Risk
VaR.hist <- quantile(returns1,alpha)
VaR.text <- paste("Value at Risk =", round(VaR.hist, 2))
  
# Determine the max y value of the density plot.
# This will be used to place the text above the plot
VaR.y <- max(density(returns1.df$Returns)$y)
  
# Expected Shortfall
ES.hist <- median(returns1[returns1 > VaR.hist])
ES.text <- paste("Expected Shortfall =", round(ES.hist, 2))
  
p <- ggplot(returns1.df, aes(x = Returns, fill = Distribution)) + geom_density(alpha = 0.5) + 
    geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "firebrick1") + 
    geom_vline(aes(xintercept = ES.hist), size = 1, color = "firebrick1") +
    annotate("text", x = 2+ VaR.hist, y = VaR.y*1.05, label = VaR.text) +
    annotate("text", x = 1.5+ ES.hist, y = VaR.y*1.1, label = ES.text, size = 3) + scale_fill_manual( values = "dodgerblue4")
p
```

### Corn, Wheat, Soybeans

```{r}
p <- autoplot.zoo(ALL.r)
ggplotly(p)
```

### Volatility

```{r}
p <- autoplot.zoo(abs(ALL.r))
ggplotly(p)
```


### Persistence 1

```{r }
acf(coredata(data.xts[,1:3])) # returns
```

### Persistence 2

```{r}
acf(coredata(data.xts[,4:6])) # sizes (absolute returns)
```

### Statistics
```{r}
## data_moments function
## INPUTS: r vector
## OUTPUTS: list of scalars (mean, sd, median, skewness, kurtosis)
data_moments <- function(data){
  library(moments)
  library(matrixStats)
  mean.r <- colMeans(data)
  median.r <- colMedians(data)
  sd.r <- colSds(data)
  IQR.r <- colIQRs(data)
  skewness.r <- skewness(data)
  kurtosis.r <- kurtosis(data)
  result <- data.frame(mean = mean.r, median = median.r, std_dev = sd.r, IQR = IQR.r, skewness = skewness.r, kurtosis = kurtosis.r)
  return(result)
}
# Run data_moments()
answer <- data_moments(data.xts[, 1:3])
# Build pretty table
answer <- round(answer, 4)
knitr::kable(answer)
```

Market Risk
=======================================================================

Row {.tabset}
-----------------------------------------------------------------------

### Corn and Wheat (90 day rolling correlation)

```{r }
p <- ggplot(r_corr_vol, aes(x = index(r_corr_vol), y = corn.wheat)) + geom_line()
ggplotly(p)
```

### Corn and Soybeans (90 day rolling correlation)

```{r }
p <- ggplot(r_corr_vol, aes(x = index(r_corr_vol), y = corn.soybeans)) + geom_line()
ggplotly(p)
```

### Wheat and Soybeans (90 day rolling correlation)

```{r }
p <- ggplot(r_corr_vol, aes(x = index(r_corr_vol), y = wheat.soybeans)) + geom_line()
ggplotly(p)
```

### Correlation drill-down

```{r}
#library(psych)
ALL_df <- data.frame(corn=ALL.r[,1], wheat=ALL.r[,2], soybeans=ALL.r[,3])
pairs.panels(ALL_df)
```


Row {.tabset}
-----------------------------------------------------------------------

### Market Leverage: Corn volatility

```{r}
library(magick)
img <- image_graph(res = 96)
datalist <- split(r_corr_vol, r_corr_vol$year)
out <- lapply(datalist, function(data){
  p <- ggplot(data, aes(corn.vols, corn)) +
    geom_point() + 
    ggtitle(data$year) + 
    geom_quantile(quantiles = c(0.05, 0.95)) + 
    geom_quantile(quantiles = 0.5, linetype = "longdash") +
    geom_density_2d(colour = "red")  
  print(p)
})
dev.off()
#img <- image_background(image_trim(img), 'white')
animation <- image_animate(img, fps = .5)
animation
```


### Market Spillover: Wheat into Corn

```{r}
library(magick)
img <- image_graph(res = 96)
datalist <- split(r_corr_vol, r_corr_vol$year)
out <- lapply(datalist, function(data){
  p <- ggplot(data, aes(wheat.vols, corn.wheat)) +
    geom_point() + 
    ggtitle(data$year) + 
    geom_quantile(quantiles = c(0.05, 0.95)) + 
    geom_quantile(quantiles = 0.5, linetype = "longdash") +
    geom_density_2d(colour = "red")  
  print(p)
})
dev.off()
#img <- image_background(image_trim(img), 'white')
animation <- image_animate(img, fps = .5)
animation
```


### Corn Dependency on Wheat

```{r}
plot(summary(fit.rq.corn.wheat), parm = "log(wheat.vols)", main = "Corn-Wheat correlation sensitivity to Wheat volatility", xlab = "quantiles", ylab = "sensitivity")
```

Optimization
=======================================================================

Row {.tabset}
----------------------------------------------------------------------
```{r, include=FALSE}
library(quantreg)
x <- data.r/100
n <- nrow(x)
p <- ncol(x)
alpha <-  c(0.1, 0.3) # quantiles
w <-  c(0.3, 0.7) # distortion weights
lambda <- 100 # Lagrange multiplier for adding up constraint
m <- length(alpha)
# error handling: if (length(w) != m) stop("length of w doesn't match length of alpha")
xbar <- apply(x, 2, mean)
mu.0 <-  mean(xbar)
y <- x[, 1] #set numeraire
r <- c(lambda * (xbar[1] - mu.0), -lambda * (xbar[1] - mu.0))
X <- x[, 1] - x[, -1]
R <- rbind(lambda * (xbar[1] - xbar[-1]), -lambda * (xbar[1] - xbar[-1]))
R <- cbind(matrix(0, nrow(R), m), R)
f <- rq.fit.hogg(X, y, taus = alpha, weights = w, R = R, r = r)
fit <- f$coefficients
# transform regression coeff to portfolio weights
pihat <- c(1 - sum(fit[-(1:m)]), fit[-(1:m)]) 
x <- as.matrix(x)
yhat <- x %*% pihat # predicted 
etahat <- quantile(yhat, alpha)
muhat <- mean(yhat)
qrisk <- 0
for (i in 1:length(alpha)) qrisk <- qrisk + w[i] * sum(yhat[yhat < etahat[i]])/(n * alpha[i])
qrisk
pihat
```


```{r, include=FALSE}
library(quantreg)
#library(dplyr) # use data.df now
alpha <- 0.95
u <- quantile(data.df$returns.corn, alpha )
x <- data.df.nd[data.df.nd$returns.corn < u, 2:4]/100
n <- nrow(x)
p <- ncol(x)
alpha <-  c(0.01, 0.1) # quantiles at lower (negative) tail
w <-  c(0.95, 0.05) # distortion weights
lambda <- 100 # Lagrange multiplier for adding up constraint
m <- length(alpha) #alpha and w length must be the same
xbar <- apply(x, 2, mean)
mu.0 <-  mean(xbar)
y <- x[, 1] #set numeraire
r <- c(lambda * (xbar[1] - mu.0), -lambda * (xbar[1] - mu.0))
X <- x[, 1] - x[, -1] # set up design matrix of adjusted all but numeraire returns
R <- rbind(lambda * (xbar[1] - xbar[-1]), -lambda * (xbar[1] - xbar[-1])) # constraints
R <- cbind(matrix(0, nrow(R), m), R) #augmented constraints
f <- rq.fit.hogg(X, y, taus = alpha, weights = w, R = R, r = r) #Bob Hogg estimator
fit <- f$coefficients
# transform regression coeff to portfolio weights
pihat <- c(1 - sum(fit[-(1:m)]), fit[-(1:m)]) 
x <- as.matrix(x)
yhat <- x %*% pihat # predicted 
(etahat <- quantile(yhat, alpha))
(muhat <- mean(yhat))
qrisk <- 0
for (i in 1:length(alpha)) qrisk <- qrisk + w[i] * sum(yhat[yhat < etahat[i]])/(n * alpha[i])
qrisk
pihat
```

### Extreme frontier finance

```{r }
mu.0 <- xbar
mu.P <- seq(-.0005, 0.0015, length = 100) ## set of 100 possible target portfolio returns
qrisk.P <-  mu.P ## set up storage for quantile risks of portfolio returns
weights <-  matrix(0, nrow=300, ncol = ncol(data.r)) ## storage for portfolio weights
colnames(weights) <- names(data.r)
for (i in 1:length(mu.P))
{
  mu.0 <-  mu.P[i]  ## target returns
  result <- qrisk(x, mu = mu.0)
  qrisk.P[i] <- -result$qrisk # convert to loss risk already weighted across alphas
  weights[i,] <-  result$pihat
}
qrisk.mu.df <- data.frame(qrisk.P = qrisk.P, mu.P = mu.P )
mu.P <- qrisk.mu.df$mu.P
mu.free <-  0.0003 ## input value of risk-free interest rate
sharpe <- ( mu.P-mu.free)/qrisk.P ## compute Sharpe's ratios
ind <-  (sharpe == max(sharpe)) ## Find maximum Sharpe's ratio
ind2 <-  (qrisk.P == min(qrisk.P)) ## find the minimum-risk portfolio
ind3 <-  (mu.P > mu.P[ind2]) ## find the efficient frontier (blue)
col.P <- ifelse(mu.P > mu.P[ind2], "blue", "grey")
weights.extr <- weights[ind,] # for use in calculating tangency risk measures
qrisk.mu.df$col.P <- col.P

  p <- ggplot(qrisk.mu.df, aes(x = qrisk.P, y = mu.P, group = 1)) + geom_line(aes(colour= col.P, group = col.P)) + scale_colour_identity()  
  p <- p + geom_point(aes(x = 0, y = mu.free), colour = "red")
  options(digits=3)
  p <- p + geom_abline(intercept = mu.free, slope = (mu.P[ind]-mu.free)/qrisk.P[ind], colour = "red")
  p <- p + geom_point(aes(x = qrisk.P[ind], y = mu.P[ind])) 
  p <- p + geom_point(aes(x = qrisk.P[ind2], y = mu.P[ind2])) 
  ggplotly(p)

```

### Extreme portfolio risk measures

```{r }
price.last <- as.numeric(tail(prices[, 2:4], n=1))
# Specify the positions
position.rf <- weights.extr #c(1/3, 1/3, 1/3)
# And compute the position weights
w <- position.rf * price.last
# Fan these across the length and breadth of the risk factor series
weights.rf <- matrix(w, nrow=nrow(data.r), ncol=ncol(data.r), byrow=TRUE)
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=3)
## We need to compute exp(x) - 1 for very small x: expm1 accomplishes this
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=4)
loss.rf <- -rowSums(expm1(data.r/100) * weights.rf)
alpha.tolerance <- 0.95
u <- quantile(loss.rf, alpha.tolerance, names = FALSE)
fit.extr <- fit.GPD(loss.rf, threshold = u)

showRM(fit.extr, alpha = alpha.tolerance, RM = "ES", method = "BFGS")

```

### Markowitz efficient portfolio frontier

```{r }
R <- returns[,1:3]/100
names.R <- colnames(R)
mean.R <-  apply(R,2,mean)
cov.R <-  cov(R)
sd.R <-  sqrt(diag(cov.R)) ## remember these are in daily percentages
#library(quadprog)
Amat <-  cbind(rep(1,3),mean.R)  ## set the equality constraints matrix
mu.P <- seq(-.0005, 0.0015, length = 100)  ## set of 100 possible target portfolio returns
sigma.P <-  mu.P ## set up storage for std dev's of portfolio returns
weights <-  matrix(0, nrow=300, ncol = ncol(R)) ## storage for portfolio weights
colnames(weights) <- names.R
for (i in 1:length(mu.P))
{
  bvec <- c(1,mu.P[i])  ## constraint vector
  result <- solve.QP(Dmat=2*cov.R,dvec=rep(0,3),Amat=Amat,bvec=bvec,meq=2)
  sigma.P[i] <- sqrt(result$value)
  weights[i,] <- result$solution
}
sigma.mu.df <- data.frame(sigma.P = sigma.P, mu.P = mu.P )
mu.free <-  .0003 ## input value of risk-free interest rate
sharpe <- ( mu.P-mu.free)/sigma.P ## compute Sharpe's ratios
ind <-  (sharpe == max(sharpe)) ## Find maximum Sharpe's ratio
ind2 <-  (sigma.P == min(sigma.P)) ## find the minimum variance portfolio
ind3 <-  (mu.P > mu.P[ind2]) ## finally the efficient frontier
col.P <- ifelse(mu.P > mu.P[ind2], "blue", "grey")
sigma.mu.df$col.P <- col.P
p <- ggplot(sigma.mu.df, aes(x = sigma.P, y = mu.P, group = 1)) + geom_line(aes(colour=col.P, group = col.P)) + scale_colour_identity() # + xlim(0, max(sd.R*1.1))  + ylim(0, max(mean.R)*1.1) + 
p <- p + geom_point(aes(x = 0, y = mu.free), colour = "red")
options(digits=4)
p <- p + geom_abline(intercept = mu.free, slope = (mu.P[ind]-mu.free)/sigma.P[ind], colour = "red")
p <- p + geom_point(aes(x = sigma.P[ind], y = mu.P[ind])) 
p <- p + geom_point(aes(x = sigma.P[ind2], y = mu.P[ind2])) ## show min var portfolio
p <- p + annotate("text", x = sd.R[1], y = mean.R[1], label = names.R[1]) + annotate("text", x = sd.R[2], y = mean.R[2], label = names.R[2]) + annotate("text", x = sd.R[3], y = mean.R[3], label = names.R[3])
ggplotly(p)
```

### Portfolio Weights

The weights for the Markowitz tangency portfolio ("*") are

```{r echo = FALSE}
library(scales)
weights[ind, ][1, ] # tangency (maximum Sharpe ratio) weights
name <- colnames(weights)
posn <- ifelse((weights[ind,]<0), "go short (sell)", "go long (buy)")
value <- percent(abs(weights[ind,]))
```

The weights for the QR tangency portfolio are

```{r echo = FALSE}
library(scales)
weights.extr[1,]
```

For the working capital accounts:

1. $100 million overall portfolio
2. Buy $31 million in corn
3. Buy $17 million in wheat
4. Buy $52 million in soybeans

### Risk Assessment
We need to know how much cash is needed to support the risk tolerance and threshold. To do so, we set the organization's risk tolerance at 5%: that is, losses may exceed 10% at most 5% of the time. The organization also sets the collateral return at 2% and the risky contract's standard deviation at 25%. We then solve for w in the equation:

$$
z = \frac{- 0.1 - 0.1w - 0.02 ( 1 - w )}{0.25w}
$$

or

$$
\frac{- 0.12 - 0.08w}{0.25w} = NormalInverse(0.05)
$$

We calculate NormalInverse(0.05) by performing
```{r, echo=TRUE}
qnorm(0.05)
```
then, to solve for w:
```{r, echo=TRUE}
risky <- -0.12/(0.25*qnorm(0.05) + 0.08) # w = -0.12/(0.25 z + 0.08) with z = qnorm(0.05)
paste('The weight invested in the risky contract:', round(risky, 4))
```
Therefore, the risky contract should be roughly 36% of the portfolio value.
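For reference, this weight follows from rearranging the first display equation above (note that $-0.1w - 0.02(1-w) = -0.02 - 0.08w$, so the coefficient on $w$ in the numerator is $-0.08$):

$$
\frac{-0.12 - 0.08w}{0.25w} = \Phi^{-1}(0.05) \approx -1.645
\quad\Longrightarrow\quad
w = \frac{-0.12}{0.25(-1.645) + 0.08} \approx 0.362 .
$$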


Loss measurement
=======================================================================
### Empirical loss

```{r}
# Get last prices
price.last <- as.numeric(tail(prices[, 2:4], n=1))
# Specify the positions
position.rf <- c(1/3, 1/3, 1/3)
# And compute the position weights
w <- position.rf * price.last
# Fan these across the length and breadth of the risk factor series
weights.rf <- matrix(w, nrow=nrow(data.r), ncol=ncol(data.r), byrow=TRUE)
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=3)
## We need to compute exp(x) - 1 for very small x: expm1 accomplishes this
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=4)
loss.rf <- -rowSums(expm1(data.r/100) * weights.rf)
loss.rf.df <- data.frame(Loss = loss.rf, Distribution = rep("Historical", each = length(loss.rf)))
## Simple Value at Risk and Expected Shortfall
alpha.tolerance <- .95
VaR.hist <- quantile(loss.rf, probs=alpha.tolerance, names=FALSE)
## Just as simple Expected shortfall
ES.hist <- median(loss.rf[loss.rf > VaR.hist])
VaR.text <- paste("Value at Risk =\n", round(VaR.hist, 2)) # ="VaR"&c12
ES.text <- paste("Expected Shortfall \n=", round(ES.hist, 2))
title.text <- paste(round(alpha.tolerance*100, 0), "% Loss Limits")
p <- ggplot(loss.rf.df, aes(x = Loss, fill = Distribution)) + geom_histogram(alpha = 0.8) + geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "blue") + geom_vline(aes(xintercept = ES.hist), size = 1, color = "blue") + annotate("text", x = VaR.hist, y = 40, label = VaR.text) + annotate("text", x = ES.hist, y = 20, label = ES.text) + xlim(0, 0.4) + ggtitle(title.text)
ggplotly(p)
```

Extremes
=======================================================================
Row {.tabset}
-----------------------------------------------------------------------

### Mean excess loss

```{r}
data <- as.vector(loss.rf) # data is purely numeric
umin <-  min(data)         # threshold u min
umax <-  max(data) - 0.1   # threshold u max
nint <- 100                # grid length to generate mean excess plot
grid.0 <- numeric(nint)    # grid store
e <- grid.0                # store mean exceedances e
upper <- grid.0            # store upper confidence interval
lower <- grid.0            # store lower confidence interval
u <- seq(umin, umax, length = nint) # threshold u grid
alpha <- 0.95                  # confidence level
for (i in 1:nint) {
    data <- data[data > u[i]]  # subset data above thresholds
    e[i] <- mean(data - u[i])  # calculate mean excess of threshold
    sdev <- sqrt(var(data))    # standard deviation
    n <- length(data)          # sample size of subsetted data above thresholds
    upper[i] <- e[i] + (qnorm((1 + alpha)/2) * sdev)/sqrt(n) # upper confidence interval
    lower[i] <- e[i] - (qnorm((1 + alpha)/2) * sdev)/sqrt(n) # lower confidence interval
  }
mep.df <- data.frame(threshold = u, threshold.exceedances = e, lower = lower, upper = upper)
p <- ggplot(mep.df, aes( x= threshold, y = threshold.exceedances)) + geom_line() + geom_line(aes(x = threshold, y = lower), colour = "red") + geom_line(aes(x = threshold,  y = upper), colour = "red") + annotate("text", x = 0.1, y = 0.07, label = "upper 95%") + annotate("text", x = 0.1, y = 0.01, label = "lower 95%")
ggplotly(p)
```

####

The upper and lower red lines form a confidence band that shows how stable our exceedance estimates are. The middle black line is the mean excess of losses at each threshold, computed from that threshold's exceedances.
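Formally, the black line estimates the mean excess function of the loss $L$ at each threshold $u$:

$$
e(u) = E[\,L - u \mid L > u\,] \approx \frac{1}{N_u} \sum_{i:\, L_i > u} (L_i - u),
$$

where $N_u$ is the number of exceedances of $u$. A roughly linear, upward-sloping $e(u)$ in the tail is the pattern that motivates fitting a Generalized Pareto Distribution beyond a high threshold.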
### GPD fits and starts

```{r}
#library(QRM)
alpha.tolerance <- 0.95
u <- quantile(loss.rf, alpha.tolerance, names = FALSE) # threshold at the 95th percentile of losses
loss.excess <- loss.rf[loss.rf > u]                    # losses exceeding the threshold
fit <- fit.GPD(loss.rf, threshold = u)                 # fit GPD to the excesses
xi.hat <- fit$par.ests[["xi"]]                         # fitted shape xi
beta.hat <- fit$par.ests[["beta"]]                     # fitted scale beta
n.relative.excess <- length(loss.excess) / length(loss.rf) # = N_u/n
VaR.gpd <- u + (beta.hat/xi.hat) * (((1 - alpha.tolerance) / n.relative.excess)^(-xi.hat) - 1)
ES.gpd <- (VaR.gpd + beta.hat - xi.hat * u) / (1 - xi.hat)
# Plot away
VaRgpd.text <- paste("GPD: Value at Risk =", round(VaR.gpd, 2))
ESgpd.text <- paste("Expected Shortfall =", round(ES.gpd, 2))
title.text <- paste(VaRgpd.text, ESgpd.text, sep = " ")
loss.plot <- ggplot(loss.rf.df, aes(x = Loss, fill = Distribution)) + geom_density(alpha = 0.2)
loss.plot <- loss.plot + geom_vline(aes(xintercept = VaR.gpd), colour = "blue", linetype = "dashed", size = 0.8)
loss.plot <- loss.plot + geom_vline(aes(xintercept = ES.gpd), colour = "blue", size = 0.8)
loss.plot <- loss.plot + xlim(0, 0.4) + ggtitle(title.text)
ggplotly(loss.plot)
```
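The chunk above implements the standard peaks-over-threshold risk measures for a GPD fitted above threshold $u$ with shape $\hat\xi$ and scale $\hat\beta$, where $N_u/n$ is the fraction of losses exceeding the threshold:

$$
\mathrm{VaR}_{\alpha} = u + \frac{\hat\beta}{\hat\xi}\left[\left(\frac{1-\alpha}{N_u/n}\right)^{-\hat\xi} - 1\right],
\qquad
\mathrm{ES}_{\alpha} = \frac{\mathrm{VaR}_{\alpha} + \hat\beta - \hat\xi\, u}{1 - \hat\xi}.
$$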

### Confidence and risk measures

```{r}
showRM(fit, alpha = 0.95, RM = "ES", method = "BFGS")
```


Insights
=======================================================================

Row {.tabset}
-------------------------------------------------------------------------
### Key Insights
1. All three crops have experienced spikes in price and in the magnitude of percentage changes. Commonly shared drivers are climate and competing producers. Corn vs. wheat and corn vs. soybeans show moderate positive correlation (corn vs. wheat is 0.63; corn vs. soybeans is 0.58).

2. The value at risk of the crop mix tracks market performance, and therefore the value of the commodities. The timing of farming is directly tied to the projected volatility of the crop mix: certain mixes of crops provide less volatility and more value for the risk taken. Corn, wheat and soybeans all have very low mean returns and comparatively small standard deviations, which makes them more consistent and less risky inputs when deciding the optimal size and mix.

3. The working capital account of \$100 million should be allocated as follows: buy \$31 million in corn, \$17 million in wheat, and \$52 million in soybeans.


Row
------------------------------------------------------------------------
### References

**R Packages used**

ggplot2, scales, quadprog, quantreg, shiny, flexdashboard, qrmdata, xts, matrixStats, zoo, QRM, plotly, and psych


**References**
- GitHub and Stack Overflow for troubleshooting

- Brian Peirce for the help in brainstorming business questions

- www.thebalance.com 

- www.macrotrend.net 

- https://turing.manhattan.edu/~wfoote01/finalytics/_site/index.html

- https://bookdown.org/wfoote01/faur/

- http://www.futuresknowledge.com